A new transfer entropy method for measuring directed connectivity from complex-valued fMRI data

Background Inferring directional connectivity of brain regions from functional magnetic resonance imaging (fMRI) data has been shown to provide additional insights into predicting mental disorders such as schizophrenia. However, existing research has focused on the magnitude data from complex-valued fMRI data without considering the informative phase data, thus ignoring potentially important information. Methods We propose a new complex-valued transfer entropy (CTE) method to measure causal links among brain regions in complex-valued fMRI data. We use the transfer entropy to model a general non-linear magnitude–magnitude and phase–phase directed connectivity and utilize partial transfer entropy to measure the complementary phase and magnitude effects on magnitude–phase and phase–magnitude causality. We also define the significance of the causality based on a statistical test and the shuffling strategy of the two complex-valued signals. Results Simulated results verified higher accuracy of CTE than four causal analysis methods, including a simplified complex-valued approach and three real-valued approaches. Using experimental fMRI data from schizophrenia and controls, CTE yields results consistent with previous findings but with more significant group differences. The proposed method detects new directed connectivity related to the right frontal parietal regions and achieves 10.2–20.9% higher SVM classification accuracy when inferring directed connectivity using anatomical automatic labeling (AAL) regions as features. Conclusion The proposed CTE provides a new general method for fully detecting highly predictive directed connectivity from complex-valued fMRI data, with magnitude-only fMRI data as a specific case.


Introduction
To date, a huge number of studies have investigated directed functional connectivity (FC) or functional network connectivity (FNC) using fMRI data (Demirci et al., 2009; Stevens et al., 2009; Lizier et al., 2011; Ursino et al., 2020; Crimi et al., 2021). Directed FC/FNC refers to the statistical causality between different time series from brain regions of interest (ROIs) or time courses of brain networks extracted by data-driven methods from fMRI data (Stevens et al., 2009; Ursino et al., 2020; Mahmood et al., 2022). The directed FC/FNC results have been widely used as putative biomarkers to identify or predict brain function changes linked to mental disorders such as schizophrenia (Fogelson et al., 2014; Bastos-Leite et al., 2015; Dietz et al., 2020).
Directed FC/FNC analyses can be generally classified into model-based and model-free methods. Typical model-based methods include dynamic causal modeling (DCM) (Friston et al., 2019), structural equation modeling (Bielczyk et al., 2019), and dynamic Bayesian network (Wu et al., 2014). Regarding model-free methods, the Granger causal test is frequently used to determine whether there is a linear causal relationship between ROIs and brain networks (Demirci et al., 2009; Crimi et al., 2021). Demirci et al. (2009) exploited the Granger causal test to calculate directed FNC of fMRI data and found abnormal connections from frontal areas to visual areas for patients with schizophrenia. Crimi et al. (2021) used Granger connections to classify patients with autism spectrum disorder and healthy controls.
Real-valued transfer entropy is utilized to identify the underlying non-linear directed information between ROIs or between brain networks (Lizier et al., 2011; Ursino et al., 2020; Liu et al., 2022). Ursino et al. (2020) verified that transfer entropy is a promising method to estimate the causality of connections between regions with long time delays. Lizier et al. (2011) presented a transfer entropy method to detect causality between brain regions in cognitive tasks and showed that task difficulty is related to causal strength for the motor cortex. Following this, Liu et al. (2022) proposed a score function based on transfer entropy and conditional entropy to quantify directed FC, which accurately inferred directed connectivity networks of time series. The most commonly used method for estimating real-valued transfer entropy is the histogram-based transfer entropy (HTE), which estimates the joint probability density function via a histogram-based function. Other transfer entropy algorithms were proposed to improve the accuracy of causal inference or noise robustness, including symbolic transfer entropy (STE) (Li and Zhang, 2022), effective transfer entropy (Behrendt et al., 2019; Caserini and Pagnottoni, 2022), Renyi transfer entropy (Jizba et al., 2022; Zhang et al., 2023), and phase transfer entropy (Wang and Chen, 2020; Gu et al., 2021).
Our study is motivated by two key points. First, previous studies show non-linear FC/FNC properties in fMRI (Li et al., 2010, 2011; Motlaghian et al., 2023). The transfer entropy approach is designed to capture non-linear causality (Schreiber, 2000), while the Granger causal test, as a linear model-free approach, may fail to do so (Bastos and Schoffelen, 2016). Second, fMRI data are initially acquired as complex-valued image pairs including both magnitude and phase data (Calhoun et al., 2002; Rowe and Logan, 2004; Adali and Calhoun, 2007). A new transfer entropy approach is needed to incorporate the unique and additional information from the phase data in addition to the magnitude-only fMRI data (Yu et al., 2015). Simply summing the separate real-valued results from the magnitude and the phase data loses accuracy because the magnitude and the phase are also correlated. As such, we propose a new complex-valued transfer entropy (CTE) to detect full causality between two complex-valued signals.
The main contributions of this study are 3-fold: 1. We propose a new CTE method to measure non-linear causal (directed) connectivity between two complex-valued signals by incorporating complementary causality between magnitude and phase using the partial transfer entropy, in addition to detecting magnitude-magnitude and phase-phase causality using transfer entropy.

Modeling and derivation of CTE
Figure 1 shows the framework diagram for measuring directed FC using CTE. Taking two AAL regions AAL_n1 and AAL_n2 as an example, an average complex-valued time series with magnitude and phase is obtained for each region. To quantify complete complex-valued causality, CTE measures magnitude-magnitude causality, phase-phase causality, and two parts of causality between magnitude and phase. To guarantee the reliability of the causality measurement, a causal significance test is performed. The direction of FC is judged by the polarity of CTE. If the CTE value is positive, the directed FC points from AAL_n1 to AAL_n2; if the CTE value is negative, the direction is the opposite; if CTE equals zero, there is no directed FC between the two AAL regions.
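We denote two complex-valued signals as $z_1, z_2 \in \mathbb{C}^T$, where $T$ is the data length. The two signals are represented with magnitude and phase in Eq. (1) as follows:
$$z_1 = a \odot e^{j\theta}, \qquad z_2 = b \odot e^{j\phi} \tag{1}$$
where $a \in \mathbb{R}^T$ and $\theta \in \mathbb{R}^T$ are the magnitude and phase of $z_1$, $b \in \mathbb{R}^T$ and $\phi \in \mathbb{R}^T$ are the magnitude and phase of $z_2$, and $\odot$ denotes the element-wise product.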
Based on the relationship between the magnitude and phase of the brain networks (Yu et al., 2015), we propose a definition of CTE considering complete causality between two complex-valued signals.
Motivated by the complex-valued mutual information introduced by Goebel et al. (2011), CTE combines the magnitude and phase to make causality inference and is represented in Eq. (2) in terms of four quantities. Here, $\mathcal{T}_{a \to b}$ and $\mathcal{T}_{\theta \to \phi}$ are the real-valued transfer entropy from the magnitude and the phase of the two signals, respectively, and $\mathcal{T}_{\theta \to b \mid a}$ and $\mathcal{T}_{a \to \phi \mid \theta}$ are partial transfer entropy (Papana et al., 2012), which extends transfer entropy to account for the presence of a third variable. We use the partial transfer entropy to quantify the complementary phase and magnitude effects on the causality.

Real-valued transfer entropy $\mathcal{T}_{a \to b}$ and $\mathcal{T}_{\theta \to \phi}$ in Eq. (2) can be calculated using Eq. (3) (Schreiber, 2000), where $a \to b$ denotes the causal direction from $a$ to $b$ and $p$ is a marginal probability density function. In Eq. (4), it can be observed that phase $\theta$ and magnitude $a$ are jointly used to determine the causal direction to magnitude $b$. In other words, $\mathcal{T}_{\theta \to b \mid a}$ incorporates magnitude-phase causality between $\theta$ and $b$ by quantifying the complementary magnitude effects of $a$ on the causality. Similarly, $\mathcal{T}_{a \to \phi \mid \theta}$ can be calculated using Eq. (5). The remaining terms in these expressions are joint probability density functions.
To estimate the joint probability density functions, we perform symbolic processing on each of the variables. The symbolic process improves the noise robustness relative to traditional transfer entropy and helps capture more non-linear causality, as shown by a previous study (Gu et al., 2021). Taking the phase $\theta = [\theta_1, \ldots, \theta_T]^{\mathrm{T}}$ as an example, where the superscript "T" represents the matrix transpose, the symbolic version of $\theta_t$, $1 \le t \le T$, denoted as $\theta_t^{*}$, is computed using Eqs. (7) and (8) (Wessel et al., 2000), where $\beta$ is a control parameter set to 0.05 according to the previous study (Wessel et al., 2000), and $\mu_p$ and $\mu_n$ are the means of the positive and negative values of $\theta$, respectively. As such, we obtain the symbolic variable vector of $\theta$ as $\theta^{*}$. For simplicity, the superscript "*" is omitted. $\phi$ can be symbolized in the same way. The magnitudes $a$ and $b$ are symbolized using only Eq. (7), since their values are non-negative.
Figure 1. Framework diagram for directed FC measured by CTE. First, average complex-valued time series from two AAL regions are obtained. Each average time series involves magnitude and phase. To quantify complete complex-valued causality, CTE considers four parts of causality, including magnitude-magnitude, phase-phase, and two magnitude-phase causality. After the causal significance test, the directed FC between two regions is measured by CTE.
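The symbolization step can be illustrated with a short Python sketch. It is not a reproduction of Eqs. (7) and (8): it assumes a four-level rule in the spirit of Wessel et al. (2000), in which each sample is compared with the mean of its own sign class scaled by $(1 \pm \beta)$. The function name `symbolize` and the integer symbol labels are hypothetical choices made for illustration only.

```python
from statistics import mean


def symbolize(values, beta=0.05):
    """Map samples to integer symbols (assumed Wessel-style rule, not Eqs. 7-8)."""
    pos = [v for v in values if v >= 0]
    neg = [v for v in values if v < 0]
    mu_p = mean(pos) if pos else 0.0   # mean of the non-negative samples
    mu_n = mean(neg) if neg else 0.0   # mean of the negative samples

    def level(x, mu):
        # x and mu are non-negative here; four levels around mu.
        if x > (1 + beta) * mu:
            return 3
        if x > mu:
            return 2
        if x > (1 - beta) * mu:
            return 1
        return 0

    symbols = []
    for v in values:
        if v >= 0:
            symbols.append(level(v, mu_p))
        else:
            # Negative samples are compared with |mu_n| and get negative labels.
            symbols.append(-1 - level(-v, -mu_n))
    return symbols
```

For a non-negative magnitude series only the first branch is exercised, which mirrors the statement that magnitudes need only the rule for positive values.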
Then, we exploit a histogram-based method to estimate the joint probability density function by counting the number of common elements in segmented bins between vectors. Take the joint probability density function of $\theta$ and $b$ in Eq. (6) as an example: the samples of $\theta$ are divided into $k_\theta$ equal bins with the bin index denoted as $i$, and the samples of $b$ are divided into $k_\theta$ equal bins with the bin index denoted as $j$. Denoting the segmented bins of $\theta$ and $b$ as $\Delta\theta$ and $\Delta b$, the joint probability density function is estimated by counting the number of elements of $\theta$ and $b$ that fall within the segmented bin $[\Delta\theta, \Delta b]$.
The bin widths $\Delta\theta$ and $\Delta b$ are determined by the number of segmented bins and the data length. For simplicity, $\Delta\theta$ and $\Delta b$ are set equal and selected according to Eq. (9). The joint probability density function is then given by Eq. (10), where $\mathrm{num}(\Delta\theta, \Delta b)$ is the number of elements of $\theta$ and $b$ within the segmented bin $[\Delta\theta, \Delta b]$ around $(i, j)$.
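As a rough illustration of histogram-based estimation, the sketch below computes a plug-in transfer entropy from bin counts. It is a minimal example under stated assumptions, not the estimator of Eqs. (3)-(10): it assumes a single time lag, a history length of one sample, and `k` equal-width bins per variable. The names `bin_index` and `transfer_entropy` and the optional conditioning argument `cond` are hypothetical.

```python
from collections import Counter
from math import log2


def bin_index(values, k):
    """Assign each sample to one of k equal-width bins over its own range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0          # avoid zero width for constant input
    return [min(int((v - lo) / width), k - 1) for v in values]


def transfer_entropy(src, dst, cond=None, k=4, lag=1):
    """Plug-in estimate (in bits) of TE from src to dst, optionally given cond."""
    s, d = bin_index(src, k), bin_index(dst, k)
    c = bin_index(cond, k) if cond is not None else [0] * len(src)
    n = len(d) - lag
    # Joint counts over (future of dst, past of dst, past of cond, past of src).
    full = Counter((d[t + lag], d[t], c[t], s[t]) for t in range(n))
    past_all = Counter((d[t], c[t], s[t]) for t in range(n))
    fut_nosrc = Counter((d[t + lag], d[t], c[t]) for t in range(n))
    past_nosrc = Counter((d[t], c[t]) for t in range(n))
    te = 0.0
    for (df, dp, cp, sp), cnt in full.items():
        p_with = cnt / past_all[(dp, cp, sp)]                 # p(future | past, src)
        p_without = fut_nosrc[(df, dp, cp)] / past_nosrc[(dp, cp)]  # p(future | past)
        te += (cnt / n) * log2(p_with / p_without)
    return te
```

With this interface, a conditioned (partial) term such as the phase-to-magnitude one could be obtained as `transfer_entropy(theta, b, cond=a)`, while omitting `cond` gives the ordinary transfer entropy.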

Significance test of causality
The complex-valued transfer entropy $\mathcal{T}_{z_1 \to z_2}$ quantifies causality from $z_1$ to $z_2$ but cannot measure the significance of the causality. As such, we define the causality significance using a statistical test together with a shuffling strategy, which has been previously used in real-valued transfer entropy studies (Bossomaier et al., 2016). The shuffling process assists in eliminating spurious causality between $z_1$ and $z_2$, ensuring the stability and accuracy of the causality measurement. Various transfer entropy differences between the original and shuffled signals are obtained by repeating the shuffling process $R$ times. Here, the number of shuffles $R$ is set to 100. Then, a one-sample t-test on the $R$ causality differences is performed to detect the causality significance from $z_1$ to $z_2$.
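The shuffling procedure described above can be sketched generically in Python. This is an illustration under assumptions rather than the paper's exact test: it shuffles only the source signal, uses a normal approximation to the t distribution to obtain a p-value (reasonable for about 100 repeats), and omits the FDR correction. The function `shuffle_test` and its `measure` callback are hypothetical names; any causality measure taking two sequences can be plugged in.

```python
import random
from statistics import NormalDist, mean, stdev


def shuffle_test(measure, x, y, repeats=100, alpha=0.05, seed=0):
    """One-sample test on differences between original and shuffled measures."""
    rng = random.Random(seed)
    observed = measure(x, y)
    diffs = []
    for _ in range(repeats):
        xs = list(x)
        rng.shuffle(xs)                      # destroy the temporal order of the source
        diffs.append(observed - measure(xs, y))
    m, s = mean(diffs), stdev(diffs)
    if s == 0:
        # Degenerate case: all differences identical.
        return m, (0.0 if m != 0 else 1.0), m != 0
    t_stat = m / (s / len(diffs) ** 0.5)
    # Assumption: normal approximation instead of the exact t distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return m, p_value, p_value < alpha
```

The sign of the returned mean difference plays the role of the polarity used to decide the direction, and the boolean flags whether the difference is significant at level `alpha`.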
If we denote the shuffled transfer entropy as $\tilde{\mathcal{T}}$, the quantities for the two directions, $\tilde{\mathcal{T}}_{z_1 \to z_2}$ and $\tilde{\mathcal{T}}_{z_2 \to z_1}$, are obtained in Eqs. (11) and (12), respectively. With $\Delta$ collecting all the transfer entropy differences between $\tilde{\mathcal{T}}_{z_1 \to z_2}$ and $\tilde{\mathcal{T}}_{z_2 \to z_1}$, we perform a one-sample t-test with the false discovery rate (FDR) correction (Guo and Bhaskara, 2008), where $\bar{\Delta}$ is the mean of $\Delta$ and $p_{th} = 0.05$. We define the causal direction by the sign of $\bar{\Delta}$.

Experimental methods

Simulated signals
To evaluate the efficacy of CTE, we generate two sets of simulated complex-valued signals with linear and non-linear causality, respectively. Each set has three types of causal directions and is randomly generated 1,000 times and divided into 10 groups.
The baseline signals are generated using a widely used MATLAB toolbox named Granger causal connectivity analysis (GCCA) (Seth, 2010). The signals are generated with real-valued linear causality via the AR model of Seth (2010) in Eq. (15), where $3 \le t \le T$, $T$ is the data length, set to 146 to match the data length of the fMRI data, and $w_t^{1}$ and $w_t^{2}$ are normally distributed random variables with zero mean and unit variance. The linear and non-linear causality with different causality cases can be obtained by exploiting and modifying the baseline signals defined in Eq. (15). When generating simulated signals with linear causality, the three types of simulated complex-valued signals are denoted as type L1, L2, and L3, respectively. The magnitude and phase of the two signals $z_1$ and $z_2$ from the three linear types are generated using Eqs. (16)-(18), where $r_t^{1}$ and $r_t^{2}$ are random signals without causality.
The non-linear causality can be obtained by adding quadratic and third-order terms to Eq. (15). The magnitude and phase of the three types (N1, N2, and N3) can be generated using Eqs. (19)-(21). Figure 2 shows the ground-truth causal directions for the three types of two simulated complex-valued signals with non-linear and linear causality. Specifically, type L1/N1 has the complete complex-valued causality including magnitude-magnitude, phase-phase, and magnitude-phase; type L2/N2 has the incomplete complex-valued causality including magnitude-magnitude and phase-phase; type L3/N3 only has magnitude-magnitude causality. Figure 3 presents example waveforms of simulated signals $z_1$ and $z_2$ from type L1 and type N1. The ground-truth causal direction for the magnitude and phase is shown in Figure 2A. We observe that the peaks of the cause signals ($z_1$ magnitude and $z_1$ phase, in red) are ahead of those of the effect signals ($z_2$ magnitude and $z_2$ phase, in blue) in all cases, which is consistent with the causal direction of Figure 2. To test the noise effects on CTE, we also add Gaussian noise to the simulated signals with the signal-to-noise ratio (SNR) ranging from −10 dB to 10 dB.
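Adding noise at a prescribed SNR can be sketched as follows. The example assumes that circular complex Gaussian noise is added directly to the complex-valued signal, with the noise power split equally between real and imaginary parts; whether the noise in the simulations is added this way or to magnitude and phase separately is not specified here. The function name `add_complex_noise` is hypothetical.

```python
import math
import random


def add_complex_noise(signal, snr_db, seed=0):
    """Add complex Gaussian noise so that signal power / noise power = snr_db."""
    rng = random.Random(seed)
    power = sum(abs(z) ** 2 for z in signal) / len(signal)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)   # equal share for real and imaginary parts
    return [z + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for z in signal]
```

Sweeping `snr_db` from −10 to 10 then yields the noisy versions on which an accuracy-versus-SNR curve can be evaluated.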

Experimental fMRI data
The resting-state complex-valued fMRI data were a self-collected dataset from 80 subjects, including 40 healthy controls (HCs) and 40 patients with schizophrenia (SZs), with written subject consent overseen by the University of New Mexico Institutional Review Board. Specifically, there are 28 men and 12 women for HCs (mean age ± standard deviation: 36.25 ± 11.40) and 33 men and 7 women for SZs (mean age ± standard deviation: 40.73 ± 14.43). During the scan, all the participants were instructed to rest quietly in the scanner, keep their eyes open without sleeping, and not think of anything in particular (Lin et al., 2022). fMRI scans were acquired by a Siemens 3 T TIM Trio scanner equipped with a 12-channel head coil. The functional scan was acquired with the following parameters: TR = 2 s, TE = 29 ms, field of view = 24 cm, acquisition matrix = 64 × 64, flip angle = 75°, slice thickness = 3.5 mm, and slice gap = 1 mm. Data preprocessing was performed using the SPM software package. Functional images were motion-corrected and then spatially normalized into the standard Montreal Neurological Institute space. Following spatial normalization, the data were resampled to 3 × 3 × 3 mm³, resulting in 53 × 63 × 46 voxels. Both magnitude and phase images were spatially smoothed with an 8 × 8 × 8 mm³ full-width half-maximum (FWHM) Gaussian kernel. Phase images were first motion-corrected using the transformations computed from magnitude-only data; then, complex division of phase data by the first time point reduced the need for phase unwrapping; and spatial normalization of phase images used the warp parameters computed from magnitude-only data.

Complex-valued time series of ROI
Brodmann area (BA) and anatomical automatic labeling (AAL) atlas are two commonly used references to divide the brain into ROIs for FC analysis. Compared with BA, AAL obtains more ROIs and By dividing the 116 ROIs into these 10 networks, it is better to reveal the regularities of connections and establish relationships between FC and FNC.
The complex-valued time series for each ROI, $x_n(t)$ with $n = 1, \ldots, 116$ and $t = 1, \ldots, T$, is expressed in Eq. (22), where the magnitude and phase of $x_n(t)$ are the averaged magnitude and phase time series across all voxels within each ROI, and $T$ denotes the total number of time points. The causality between any two ROIs can be quantified by CTE as $\bar{\Delta}\{x_{n_1}, x_{n_2}\}$.
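The construction of an ROI time series from voxel data can be sketched in Python as below. This is a minimal illustration assuming that magnitude and phase are averaged separately across voxels at each time point and then recombined in polar form. The plain arithmetic mean of the phase is an assumption that is only sensible when the phases are not wrapped. The function name `roi_time_series` is hypothetical.

```python
import cmath


def roi_time_series(voxel_series):
    """Average magnitude and phase across voxels; return one complex series.

    voxel_series: list of per-voxel lists of complex samples (equal lengths).
    """
    n_vox = len(voxel_series)
    n_t = len(voxel_series[0])
    out = []
    for t in range(n_t):
        mag = sum(abs(v[t]) for v in voxel_series) / n_vox
        # Assumption: phases are unwrapped, so an arithmetic mean is meaningful.
        pha = sum(cmath.phase(v[t]) for v in voxel_series) / n_vox
        out.append(cmath.rect(mag, pha))
    return out
```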

Performance measures
In order to evaluate the proposed CTE, we compare it with three real-valued causal analysis methods (STE, HTE, and Granger) and one complex-valued approach, sCTE, which does not consider the causality between magnitude and phase and is defined in Eq. (23). For the real-valued causal methods, both STE and HTE calculate the real-valued TE $\mathcal{T}_{a \to b}$ between magnitudes in Eq. (3). Specifically, STE applies the symbolic process in Eqs. (7) and (8) before estimating the joint PDF using Eqs. (9) and (10), while HTE estimates the joint PDF without the symbolic process.
The Granger causal test uses linear regression models to perform a statistical causality inference. Given two variables $a$ and $b$, the autoregressive (AR) model of the Granger causal test is represented in Eq. (24) as follows:
$$a_t = \sum_{j=1}^{J} u_j a_{t-j} + \sum_{j=1}^{J} v_j b_{t-j} + \mu_t, \qquad a_t = \sum_{j=1}^{J} c_j a_{t-j} + \eta_t \tag{24}$$
where $u_j$, $v_j$, and $c_j$ are the regression coefficients for the model, $J$ is the estimated time delay between $a$ and $b$, and $\mu_t$ and $\eta_t$ are two independent series satisfying a Gaussian distribution. The fitting variance when using both $a_{t-j}$ and $b_{t-j}$ to fit $a_t$ and the fitting variance when using only $a_{t-j}$ are denoted as $\sigma^2(a_t \mid a_{t-j}, b_{t-j})$ and $\sigma^2(a_t \mid a_{t-j})$, respectively. The causal direction between $a$ and $b$ is judged by comparing the fitting variances using Eq. (25). As such, the causal direction is evaluated by Granger causality.
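The variance comparison behind the Granger test can be illustrated with a lag-1 Python sketch. It is a simplified stand-in, not the implementation used for the comparisons: it assumes zero-mean series, a model order of one, and ordinary least squares solved in closed form. The function name `granger_lag1` is hypothetical.

```python
def granger_lag1(a, b):
    """Return (restricted, full) residual variances for predicting a[t] at lag 1."""
    y = a[1:]
    xa, xb = a[:-1], b[:-1]
    n = len(y)
    saa = sum(u * u for u in xa)
    sbb = sum(v * v for v in xb)
    sab = sum(u * v for u, v in zip(xa, xb))
    say = sum(u * w for u, w in zip(xa, y))
    sby = sum(v * w for v, w in zip(xb, y))
    # Restricted model: a[t] = c * a[t-1] + eta[t]
    c = say / saa
    var_r = sum((w - c * u) ** 2 for u, w in zip(xa, y)) / n
    # Full model: a[t] = u1 * a[t-1] + v1 * b[t-1] + mu[t] (2x2 normal equations)
    det = saa * sbb - sab * sab
    u1 = (say * sbb - sby * sab) / det
    v1 = (sby * saa - say * sab) / det
    var_f = sum((w - u1 * p - v1 * q) ** 2 for p, q, w in zip(xa, xb, y)) / n
    return var_r, var_f
```

A markedly smaller full-model variance suggests that the past of `b` helps predict `a`. A common way to summarize this, stated here as an assumption rather than as the decision rule of Eq. (25), is the log ratio of the two variances.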
For simulated signals, we calculate the accuracy of directed inference in Eq. (26), denoted as AOC, as follows:
$$\mathrm{AOC} = \frac{N_{correct}}{N_{total}} \tag{26}$$
where $N_{correct}$ is the number of correct causal direction judgments and $N_{total}$ is the total number of causality evaluations between two signals.
For experimental fMRI data, we first calculate the average Pearson correlation coefficient between the magnitude and phase from two ROI signals $z_1$ and $z_2$ for four pairings: (1) $z_1$ magnitude and $z_2$ magnitude, (2) $z_1$ phase and $z_2$ phase, (3) $z_1$ magnitude and $z_2$ phase, and (4) $z_1$ phase and $z_2$ magnitude.
Second, we perform two-sample t-tests ($p_{th} = 0.05$) on connections from HCs and SZs with the FDR correction (Guo and Bhaskara, 2008) to obtain significant intergroup differences in Eq. (28). Third, we compare the number of common and unique connections detected by each method. Finally, we compare the efficacy of the common and unique connections as features to classify HCs and SZs using a support vector machine (SVM). The multilayer perceptron kernel is selected, and SVM is repeated 1,000 times. Consider a training dataset of $K_1$ subjects $\{x_k, y_k\}$, $k = 1, \ldots, K_1$, where $x_k$ represents the connectivity vector from the $k$th subject, $M_1$ is the vector length, and $y_k$ is the label, either 1 or −1, indicating which class $x_k$ belongs to. SVM aims to find a hyperplane that maximizes the distance between the dataset and the hyperplane. The hyperplane is represented in Eq. (29), where $\omega$ and $b$ are parameters of the hyperplane and $\varphi(x_k)$ is the kernel function. Multiple kernel functions can be used, e.g., the linear kernel, quadratic kernel, and sigmoid kernel. By comparing the classification performance, we select the multilayer perceptron (MLP) kernel for SVM, which has three layers: the input, hidden, and output layers. The input is the connectivity vectors $x_k$, the non-linear activation function is $\tanh(\cdot)$, and the output of the MLP kernel is represented in Eq. (30) (Suykens and Vandewalle, 1999), where $\omega_1$ and $b_1$ are weights and biases, initially set to 1 and −1, respectively. As such, the SVM classifier is built on the MLP kernel and can be realized by the MATLAB built-in function named "mlp_kernel."
The results are evaluated in terms of accuracy (ACC), sensitivity (SEN), and specificity (SPEC) defined in Eq. (31) as follows (Lin et al., 2022):
$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{SEN} = \frac{TP}{TP + FN}, \qquad \mathrm{SPEC} = \frac{TN}{TN + FP} \tag{31}$$
where TP, TN, FP, and FN denote true positive, true negative, false positive, and false negative, respectively. To mitigate overfitting and guarantee reliability, leave-one-out cross-validation (LOOV) is performed. Specifically, LOOV leaves out the data from one subject as test data and uses the data from the rest of the selected subjects for training. Thus, the test data are independent of the training data in each LOOV loop. LOOV is used for cross-validation because the data are limited. We repeat the validation 1,000 times.
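The evaluation loop can be sketched in Python as follows. The sketch shows leave-one-out cross-validation and the three performance measures only. The classifier is a nearest-centroid stand-in chosen to keep the example self-contained and is not the MLP-kernel SVM used in this study. All function names are hypothetical, and treating label 1 as the positive class is an assumption.

```python
def classification_measures(y_true, y_pred, positive=1):
    """Return (ACC, SEN, SPEC) from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    acc = (tp + tn) / len(pairs)
    sen = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return acc, sen, spec


def fit_centroids(xs, ys):
    """Stand-in classifier: one mean feature vector per class."""
    cents = {}
    for label in set(ys):
        rows = [x for x, y in zip(xs, ys) if y == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents


def predict_centroid(cents, x):
    """Predict the class whose centroid is closest in squared distance."""
    return min(cents, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, cents[lab])))


def leave_one_out(features, labels, fit=fit_centroids, predict=predict_centroid):
    """Hold out each subject once, train on the rest, and collect predictions."""
    preds = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        preds.append(predict(fit(train_x, train_y), features[i]))
    return preds
```

Any classifier exposing the same `fit` and `predict` shape can be substituted for the stand-in.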

Simulated signals
Table 1 shows the accuracy of linear/non-linear directed inference for the three types of simulated signals without noise. Five directed analysis methods are compared: the proposed CTE, sCTE, and three real-valued methods (STE, HTE, and Granger). Compared with the other three transfer entropy methods (sCTE, STE, and HTE), CTE obtains higher accuracy for causality inference, especially when complete complex-valued causality is present (type L1/N1). Specifically, for type N1 (non-linear signals containing complete complex-valued causality), CTE achieves slightly higher accuracy than sCTE and 18.7-85.9% higher accuracy than the three real-valued algorithms. Figure 4 shows the estimated causality accuracy for simulated signals with different SNRs. CTE achieves the highest accuracy and noise robustness for type L1/N1, due to the consideration of complete complex-valued causality. For type L2/N2, CTE and sCTE yield higher accuracy than the three real-valued methods (STE, HTE, and Granger), especially at low SNR (< −6 dB), due to the inclusion of phase causality. Regarding type L3/N3, CTE and the other transfer entropy algorithms have similar accuracy at high SNR (> 6 dB), as only magnitude causality exists. In this sense, CTE is a general method suitable for measuring linear and non-linear causality for both complex-valued and real-valued signals. For type L3, note that Granger shows higher directed accuracy than CTE for SNR between −4 dB and 4 dB. The reason is that Granger is built on the AR model for linear causality, making it optimal when only magnitude causality exists. However, Granger fails to detect non-linear causality. Therefore, considering both linear and non-linear scenarios, the proposed CTE is the best-performing directed analysis algorithm in most cases.

Experimental fMRI data
After performing a two-sample t-test (p < 0.05, df = 78, FDR corrected) for the connections between HCs and SZs, we compare the numbers of common and unique connections detected by two different methods with significant HC-SZ differences.We select sCTE and STE as comparison methods since they have better performance for the simulated signals (refer to Figure 4).
Figure 5 shows the number of common and unique connections in terms of ten brain networks. In total, CTE obtains more common connections with sCTE than with STE (505 vs. 344), while detecting fewer unique connections with sCTE than with STE (105 vs. 266). The reason is that sCTE is closer to CTE by considering additional phase-phase causality relative to STE. Most of the common and unique connections belong to CER, which has been reported by previous studies to identify schizophrenia (Su et al., 2013; Watanabe et al., 2014). Other biomarker regions such as TEM, RFP, and visual areas (LV and MV) also show larger numbers of common and unique connections.
Table 2 shows common and unique connections between CTE and sCTE/STE with the top five significant HC-SZ differences. These connections are mainly related to the brain networks including CER, RFP, and TEM, which are consistent with the abnormal connections of schizophrenia obtained by previous studies (Su et al., 2013; Watanabe et al., 2014; Oestreich et al., 2016; Maher et al., 2019; Dietz et al., 2020; Rashidi et al., 2021). Moreover, CTE detects unique connections with highly significant HC-SZ differences related to RFP (vs. sCTE), DMN, and TEM (vs. STE). Considering the numbers and t-values of the connections with significant intergroup differences in Figure 5 and Table 2, CER, TEM, and RFP may be regarded as the biomarker brain networks for identifying schizophrenia (Su et al., 2013; Nenadic et al., 2014; Watanabe et al., 2014; Oestreich et al., 2016; Zhuo et al., 2018; Rashidi et al., 2021; Sklar et al., 2021).
Figure 6 shows the SVM classification accuracy. The features are the unique or common connections obtained by CTE, sCTE, and STE. CTE exhibits the highest accuracy (92.8%) using unique connections relative to sCTE, followed by STE. As these unique connections of CTE are mainly related to RFP and CER, as shown in Table 2, this suggests that RFP- and CER-related connections help to classify HCs and SZs. After verifying that the CTE unique connections are helpful in classification, we evaluate classification using all the significant connections, including common and unique connections, in Table 3. For all the connections with significant intergroup differences, Table 3 shows the SVM performance measures (ACC, SEN, and SPEC) from the five causality algorithms. As expected, the proposed complex-valued transfer entropy methods (CTE and sCTE) achieve better performance than the real-valued directed analysis methods. CTE shows the best classification performance among all five directed analysis methods; e.g., it improves ACC by 10.2% (95.5% vs. 85.3%) over sCTE, 13.6% (95.5% vs. 81.9%) over STE, 18.7% (95.5% vs. 76.8%) over HTE, and 20.9% (95.5% vs. 74.6%) over Granger. The proposed CTE obtains the highest values of all three classification measures, with SEN reaching 96.3%. This suggests that CTE captures meaningful and discriminative features to identify HCs and SZs.
Common and unique connections between CTE and sCTE and between CTE and STE. The network with the maximum number of connections is highlighted in red.
Li et al. 10.3389/fnins.2024.1423014 Frontiers in Neuroscience 09 frontiersin.org
Several studies have employed SVM for classifying HCs and SZs, especially using FC as features. For comparison, we select previous SVM studies whose sample sizes are similar to that of the dataset in this paper (40 HCs and 40 SZs). Su et al. (2013) applied SVM to FC quantified by an extended maximal information coefficient and obtained 82.8% classification accuracy (32 HCs and 32 SZs). By analyzing the coherence regional homogeneity value, Liu et al. (2018) demonstrated that the abnormal connections related to TEM, insula, precentral gyrus, and precuneus can be used as a psychosis biomarker of schizophrenia and achieved 89.9% accuracy (31 HCs and 48 SZs). Following this, Bae et al. (2018) pointed to decreased connections in the global and local network connectivity in SZs compared with HCs, especially in DMN, left parietal region, and TEM, with an accuracy of 92.1% (31 HCs and 48 SZs). Instead of using FC of magnitude data for classification, Li et al. (2024) utilized dynamic connectivity features of phase maps as features for classification and obtained 87.5% accuracy (24 HCs and 24 SZs). As mentioned above, the existing studies of real-valued connections achieved 82.8-92.1% SVM accuracy for classifying HCs and SZs. By making full use of both magnitude and phase fMRI data, directed FC quantified by CTE shows higher classification accuracy (95.5%) than the previous studies with similar data sizes.

Discussion
To our knowledge, few studies have explored directed FC based on complex-valued fMRI data, although directed FC has been increasingly studied using magnitude-only fMRI data. In this study, we propose a non-linear complex-valued directed analysis method based on transfer entropy to make full use of complex-valued fMRI data in highlighting differences between HCs and SZs. Simulated results show that our method has the highest accuracy and noise robustness, especially for the non-linear model with complete complex-valued causality containing magnitude-magnitude, phase-phase, and magnitude-phase relationships. Experimental results show that CTE detects more unique connections with more significant intergroup differences, thus leading to better performance in classifying HCs and SZs. Instead of directly quantifying magnitude-phase causality, we propose to introduce partial transfer entropy to exploit the complementary phase/magnitude effects on magnitude-phase and phase-magnitude causality. This is because partial transfer entropy can simultaneously utilize both magnitude and phase to assess causality in Eqs. (4) and (5), while transfer entropy only considers magnitude-phase or phase-magnitude dependence without the complementary phase/magnitude effects, as in Eqs. (32) and (33). As such, we use the transfer entropy $\mathcal{T}_{\theta \to b}$ and $\mathcal{T}_{a \to \phi}$ to replace the partial transfer entropy $\mathcal{T}_{\theta \to b \mid a}$ and $\mathcal{T}_{a \to \phi \mid \theta}$ for comparison when calculating the proposed CTE.
Figure 7 shows the directed accuracy for the simulated signals with linear and non-linear complete complex-valued causality (type L1 and N1). Using partial transfer entropy (abbreviated as partial TE) to measure the complementary phase and magnitude effects on magnitude-phase causality performs better than directly quantifying magnitude-phase causality using transfer entropy (TE), especially for the linear causality. When measuring the causality between magnitude and phase, partial TE considers more information and thus enhances the noise robustness of CTE. This verifies the effectiveness of introducing partial TE in the proposed CTE definition.
To evaluate the data length effects on CTE, we change the simulated data length from 100 to 1,000 time points. CTE keeps high causality inference accuracy, especially for the signals with complete complex-valued causality (type L1 and N1). Because CTE keeps high causality inference accuracy across different data lengths, we can combine CTE with a sliding window approach for dynamic analysis. By performing causality analysis on the segmented time series, directed FC from different windows can be obtained. As such, dynamic statistical analysis can be exploited to analyze dynamics from the directed FC.
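The sliding window idea can be sketched in a few lines of Python. This is an illustrative outline only: the window width, the step size, and the helper names `sliding_windows` and `dynamic_measure` are hypothetical, and `measure` stands for any causality function of two equally long segments.

```python
def sliding_windows(series, width, step):
    """Split a series into overlapping segments of fixed width."""
    return [series[s:s + width] for s in range(0, len(series) - width + 1, step)]


def dynamic_measure(x, y, measure, width, step):
    """Apply a two-signal measure window by window."""
    wx = sliding_windows(x, width, step)
    wy = sliding_windows(y, width, step)
    return [measure(u, v) for u, v in zip(wx, wy)]
```

The resulting list of per-window values is what a subsequent dynamic statistical analysis would operate on.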
Figure 8 shows the average Pearson correlation coefficients between magnitude and phase from two different ROI signals across all the subjects in each group. The magnitude-phase correlation coefficients range from −0.2 to 0.2. This supports the magnitude-phase causality considered in the proposed method. Specifically, CER-related connections are marked with black boxes and locally magnified. There are polarity and strength differences between HCs and SZs in both magnitude-phase and phase-magnitude correlation coefficients. This suggests that the proposed complex-valued transfer entropy considering causality between magnitude-phase and phase-magnitude is essential and can capture more intergroup differences.
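The four magnitude and phase pairings whose correlations are discussed above can be computed with a short Python sketch. It assumes two equally long complex-valued series and uses the standard sample Pearson coefficient. The function names and dictionary keys are hypothetical.

```python
import cmath
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / sqrt(sxx * syy)


def magnitude_phase_correlations(z1, z2):
    """Correlate magnitude and phase of z1 with magnitude and phase of z2."""
    a, theta = [abs(z) for z in z1], [cmath.phase(z) for z in z1]
    b, phi = [abs(z) for z in z2], [cmath.phase(z) for z in z2]
    return {
        "mag-mag": pearson(a, b),
        "phase-phase": pearson(theta, phi),
        "mag-phase": pearson(a, phi),
        "phase-mag": pearson(theta, b),
    }
```

Averaging each entry over the subjects of a group gives one value per ROI pair and pairing.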
CER-related connections show more numerous and more significant intergroup differences obtained by the proposed CTE. Although the cerebellum has been reported to be associated with the motor system, a growing number of studies have found that the cerebellum is critical to processing complex functions, e.g., attention, cognition, and language (Lungu et al., 2013). Lungu et al. reviewed 234 fMRI studies published from 1997 to 2010 related to SZs and pointed out that 41.02% of the articles reported cerebellar activity related to cognitive, emotional, and executive processes in schizophrenia. The results of their analyses suggest that the cerebellum plays an essential functional role in schizophrenia, especially in the cognitive and executive domains. Following this, we performed searches in the abstracts of articles indexed in Scopus from 2011 to 2023 and found 218 articles that reported abnormal cerebellum-related connections in schizophrenia. These studies indicate that the cerebellum is a functional hub involved in cognition, language, and emotional processing with regions including TEM, DMN, and visual areas. For instance, Table 2 highlights that Cerebelum_6_R (AAL No. 100) has connections with significant HC-SZ differences, which is also consistent with previous studies. Su et al. (2013) quantified non-linear undirected connections and pointed out that cerebellum-related ROIs, especially CRBL6.R, were important in identifying schizophrenia. Zhuo et al. (2018) calculated FC density to investigate cerebellar connectivity changes of SZs and found abnormal connectivity strength of Cerebelum_6_R with visual areas.
Figure 7. Causal inference accuracy comparison for CTE using partial transfer entropy (abbreviated as partial TE) and TE, respectively. The proposed CTE using partial TE shows higher accuracy than the CTE that uses TE to directly quantify magnitude-phase causality.
10.3389/fnins.2024.1423014 Frontiers in Neuroscience 11 frontiersin.org

Apart from CER, CTE also detects common connections related to visual areas (MV and LV) and the temporal lobe (TEM) in Table 2. These regions support visual and auditory functions, respectively. Hallucinations are a frequent symptom of schizophrenia, with visual and auditory hallucinations occurring in 70% of patients (Demirci et al., 2008), so MV and TEM are expected to be related to the pathological mechanisms of schizophrenia (Fogelson et al., 2014; Dietz et al., 2020). For the unique connections detected by CTE in Table 3, the abnormal connectivity, mainly related to RFP, is supported by previous studies. Frontal parietal regions have been shown to be involved in cognitive and perceptive processes (Smith et al., 2009) and are highly related to the impaired cognitive function of SZs (Roiser et al., 2013). Roiser et al. pointed out that connective abnormalities related to the frontal-parietal areas may be linked to cognitive impairment in SZs. The unique abnormal connectivity patterns obtained by CTE may therefore provide additional evidence for the cognitive and perceptive impairments of schizophrenia.
In addition to FC between ROIs, CTE can also measure the FNC of brain networks, which we quantify here. Table 4 shows the SVM performance of the five directed analysis methods. As with the FC results, CTE performs best among these methods: its accuracy is higher by 5.1% (88.2% vs. 83.1%) than sCTE, 9.9% (88.2% vs. 78.3%) than STE, 17.5% (88.2% vs. 70.7%) than HTE, and 16.9% (88.2% vs. 71.3%) than the Granger causal test.
In the future, our CTE approach can first be extended to analyzing the causal FNC of time courses extracted by blind source separation, e.g., ICA, sparse representation, and tensor decomposition. Second, dynamic directed FC/FNC analysis can be performed to further improve classification performance. Finally, CTE can be applied to other mental disorders such as depressive disorder, or further extended to other applications for evaluating complex-valued causality.
FIGURE 2 Ground truth causal direction for three types of simulated linear and non-linear signals. The arrow represents the causal direction.
FIGURE 4 Causality accuracy for simulated signals with linear and non-linear causality under different SNRs.

FIGURE 8 Average Pearson's correlation coefficients for (A) magnitude-phase and (B) phase-magnitude of complex-valued time series from 116 AAL ROIs across subjects in HCs or SZs.
Simulated data verify the high accuracy of CTE compared to a simplified CTE (sCTE) without magnitude-phase causality and the three real-valued methods, including STE, HTE, and Granger causal test.
2. We evaluate the significance of the non-linear directed connectivity via a one-sample t-test by using a shuffling strategy of two complex-valued signals. The statistical test assists in eliminating spurious causality, ensuring the stability and accuracy of the causality measurement.
3. We analyze directed FC using experimental resting-state complex-valued fMRI data from 40 schizophrenia patients and 40 healthy controls. CTE yields results consistent with previous findings but with more significant group differences, detects new directed connectivity, and achieves higher SVM classification accuracy, compared to sCTE, STE, HTE, and Granger causal test.
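The shuffling-based significance test mentioned in the list above can be sketched as follows. This is one plausible reading under stated assumptions, not the authors' procedure. The signals are real-valued discrete symbols rather than complex-valued fMRI series, transfer entropy is a simple plug-in estimate with history length 1, and the one-sample t statistic compares the shuffled (surrogate) estimates against the observed value. The names `transfer_entropy` and `shuffle_test` and the parameter `n_shuffles` are hypothetical.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in transfer entropy source -> target in bits (history length 1)."""
    # Each triple is (target[t+1], target[t], source[t]).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)
    c_prev = Counter((yp, xp) for _, yp, xp in triples)   # (y_prev, x_prev)
    c_pair = Counter((yn, yp) for yn, yp, _ in triples)   # (y_next, y_prev)
    c_y = Counter(yp for _, yp, _ in triples)             # y_prev
    te = 0.0
    for (yn, yp, xp), c in c_full.items():
        p_with_source = c / c_prev[(yp, xp)]         # p(y_next | y_prev, x_prev)
        p_without_source = c_pair[(yn, yp)] / c_y[yp]  # p(y_next | y_prev)
        te += (c / n) * math.log2(p_with_source / p_without_source)
    return te


def shuffle_test(source, target, n_shuffles=100, seed=0):
    """Return (observed TE, mean surrogate TE, one-sample t statistic)."""
    rng = random.Random(seed)
    observed = transfer_entropy(source, target)
    surrogates = []
    for _ in range(n_shuffles):
        shuffled = list(source)
        rng.shuffle(shuffled)  # destroys the temporal link to the target
        surrogates.append(transfer_entropy(shuffled, target))
    mean = sum(surrogates) / n_shuffles
    var = sum((s - mean) ** 2 for s in surrogates) / (n_shuffles - 1)
    if var == 0:
        raise ValueError("surrogate estimates are constant; t is undefined")
    # Positive t means the observed TE exceeds the surrogate mean.
    t_stat = (observed - mean) / math.sqrt(var / n_shuffles)
    return observed, mean, t_stat


# Toy example (assumption): y copies x with a one-step delay, so x drives y.
rng = random.Random(1)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]

te_xy, null_mean, t_stat = shuffle_test(x, y)
te_yx = transfer_entropy(y, x)
print(f"TE(x->y) = {te_xy:.3f} bits, surrogate mean = {null_mean:.4f}, t = {t_stat:.1f}")
print(f"TE(y->x) = {te_yx:.4f} bits")
```

Shuffling only the source keeps the marginal distribution of each signal intact while breaking their temporal alignment, so any transfer entropy remaining in the surrogates reflects estimator bias rather than coupling. Continuous-valued magnitude and phase series would additionally need discretization or a continuous estimator, which this sketch omits.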

TABLE 1
Comparison of the mean and standard deviation of the accuracy of causality inference by five methods for simulated signals without noise.

TABLE 2
Common and unique connections with the top five HCs-SZs significance.

TABLE 3
SVM classification is performed by combining unique connections and common connections.

TABLE 4
SVM classification is performed based on the directed FNC of brain networks.